We provide the data and the R code used in the article “Exploring land use prediction errors from area frame survey data” so that readers may be able to reproduce all the figures, tables and statistics presented in the article with the R software.

Please cite the use of our resources as:

Chakir R., Laurent T., Ruiz-Gazen A., Thomas-Agnan C. and Vignes C. (2018). Exploring land use prediction errors from area frame survey data. CSBIGS.

1 Prerequisites

Packages needed:

install.packages(c("rgdal", "maptools", "RColorBrewer", "classInt", "spdep"))

Loading packages:

require("rgdal")  # import spatial data
## Loading required package: rgdal
## Loading required package: sp
## rgdal: version: 1.2-18, (SVN revision 718)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 2.2.2, released 2017/09/15
##  Path to GDAL shared files: /usr/share/gdal/2.2
##  GDAL binary built with GEOS: TRUE 
##  Loaded PROJ.4 runtime: Rel. 4.9.2, 08 September 2015, [PJ_VERSION: 492]
##  Path to PROJ.4 shared files: (autodetected)
##  Linking to sp version: 1.2-7
require("maptools") # leglabs function
## Loading required package: maptools
## Checking rgeos availability: TRUE
library("RColorBrewer") # brewer.pal function
require("classInt") # function classIntervals
## Loading required package: classInt
## Loading required package: spData
## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source'))`
require("spdep") # spatial econometric analysis
## Loading required package: spdep
## Loading required package: Matrix

Information about the current R session :

sessionInfo()
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.5 LTS
## 
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
## 
## locale:
##  [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
##  [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
##  [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] spdep_0.7-7        Matrix_1.2-14      classInt_0.2-3    
## [4] spData_0.2.9.0     RColorBrewer_1.1-2 maptools_0.9-3    
## [7] rgdal_1.2-18       sp_1.3-1          
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.17      knitr_1.20        magrittr_1.5     
##  [4] gmodels_2.16.2    splines_3.4.4     MASS_7.3-49      
##  [7] lattice_0.20-35   stringr_1.3.1     tools_3.4.4      
## [10] grid_3.4.4        nlme_3.1-137      coda_0.19-1      
## [13] e1071_1.6-8       deldir_0.1-15     gtools_3.5.0     
## [16] htmltools_0.3.6   class_7.3-14      yaml_2.1.19      
## [19] rprojroot_1.3-2   digest_0.6.15     evaluate_0.10.1  
## [22] rmarkdown_1.9     gdata_2.18.0      stringi_1.2.3    
## [25] compiler_3.4.4    LearnBayes_2.15.1 backports_1.1.2  
## [28] boot_1.3-20       expm_0.999-2      foreign_0.8-70

2 Importing the data

2.1 Point levels data

The main data (point levels data) is given in the csv file “./data/point_level.csv”. It contains 502205 observations and 21 variables:

  • Ui: value of the categorical variable “simulated land use” (1 = urban, 2 = farming, 3 = forests, 4 = pastures, 5 = natural land),
  • pi1, pi2, pi3, pi4, pi5: the DGP probabilities obtained using the regression tree (see Figure 2),
  • clc2: corine land cover (in 13 classes, see http://www.eea.europa.eu/publications/COR0-landcover),
  • altitude: altitude in meters (source: BDAlti, IGN),
  • land.price: land price in euros/km2 (source: Agreste),
  • pop: population density in inhabitant/km2 (source: INSEE),
  • INSEE_COM: the id of the commune,
  • teruti: a boolean with TRUE if the point is used in the French Teruti-Lucas database,
  • numseg: the id of the segment,
  • A6, A5, A4, A3, A2, A03, A02, A01 : the id of the regular square grids at several scales (see Chakir et al (2016), Spatial scale in land use models: Application to the Teruti-Lucas survey, Spatial Statistics, 18(A), 246–262, doi:10.1016/j.spasta.2016.06.009).

For importing the point level data:

point.level <- read.csv2("./data/point_level.csv")

For transforming the data into Spatial object:

coordinates(point.level) <- ~ x + y
proj4string(point.level) <- CRS("+proj=utm +zone=31 +ellps=GRS80 
                                +units=m +no_defs")

Please, note that he CRS used for all the geographical coordinates is:

CRS("+proj=utm +zone=31 +ellps=GRS80 +units=m +no_defs")

2.2 The boundaries of different administrative areas

The boundaries of different administrative areas are included in the following folders:

  • ./data/cities: a shapefile containing the boundaries of the communes of France in Midi-Pyrénées region (source: IGN),
  • ./data/dept: a shapefile containing the boundaries of the departements in Midi-Pyrénées region (source: IGN),
  • ./data/segments: a shapefile containing the boundaries of the segment levels data.

For importing the shapes of the departements:

dept <- readOGR(dsn = "./data/dept", layer = "dept")
## OGR data source with driver: ESRI Shapefile 
## Source: "/media/laurent/DATAPART1/Tibo - Codes/Codes CSBIGS/csbigs/data/dept", layer: "dept"
## with 8 features
## It has 11 fields
## Integer64 fields read as strings:  X_CHF_LIEU Y_CHF_LIEU X_CENTROID Y_CENTROID
proj4string(dept)
## [1] "+proj=utm +zone=31 +ellps=GRS80 +units=m +no_defs"

For importing the shapes of the cities:

cities <- readOGR(dsn = "./data/cities", layer = "cities")
## OGR data source with driver: ESRI Shapefile 
## Source: "/media/laurent/DATAPART1/Tibo - Codes/Codes CSBIGS/csbigs/data/cities", layer: "cities"
## with 3020 features
## It has 18 fields
## Integer64 fields read as strings:  X_CHF_LIEU Y_CHF_LIEU X_CENTROID Y_CENTROID Z_MOYEN SUPERFICIE POPULATION
proj4string(cities)
## [1] "+proj=utm +zone=31 +ellps=GRS80 +units=m +no_defs"

For importing the shapes of the segments

segment.level <- readOGR(dsn = "./data/segments", layer = "segments")
## OGR data source with driver: ESRI Shapefile 
## Source: "/media/laurent/DATAPART1/Tibo - Codes/Codes CSBIGS/csbigs/data/segments", layer: "segments"
## with 2579 features
## It has 6 fields
## Integer64 fields read as strings:  numseg
proj4string(segment.level)
## [1] "+proj=utm +zone=31 +ellps=GRS80 +units=m +no_defs"

3 Preparing the data

3.1 Preparation of point level data

Step 1) dichotomization of Ui by creating variables di1, di2, di3, di4, di5:

di <- model.matrix(~ as.factor(point.level@data$Ui) - 1)
point.level@data[,c("di1", "di2", "di3", "di4", "di5")] <- di

Step 2) we create the absolute response error \(|p_{ik} - d_{ik}|\):

point.level@data$erreur1.abs <- abs(point.level@data$pi1 - point.level@data$di1)
point.level@data$erreur2.abs <- abs(point.level@data$pi2 - point.level@data$di2)
point.level@data$erreur3.abs <- abs(point.level@data$pi3 - point.level@data$di3)
point.level@data$erreur4.abs <- abs(point.level@data$pi4 - point.level@data$di4)
point.level@data$erreur5.abs <- abs(point.level@data$pi5 - point.level@data$di5)

Step 3) we create the relative response error \(\frac{|p_{ik} - d_{ik}|}{p_{ik}}\):

point.level@data$erreur1.rel <- point.level@data$erreur1.abs/point.level@data$pi1
point.level@data$erreur2.rel <- point.level@data$erreur2.abs/point.level@data$pi2
point.level@data$erreur3.rel <- point.level@data$erreur3.abs/point.level@data$pi3
point.level@data$erreur4.rel <- point.level@data$erreur4.abs/point.level@data$pi4
point.level@data$erreur5.rel <- point.level@data$erreur5.abs/point.level@data$pi5

Step 4) we create the Gini-Simpson impurity index gi (noted \(gs_i\) in the article):

point.level@data$gi <- 1 - rowSums(point.level@data[, c("pi1","pi2","pi3",
                                                        "pi4","pi5")]^2)

3.2 Preparation of areal level data

Step 1) we create the average of point level Gini-Simpson index gi (noted \(\overline{gs}_g\) in the article):

res <- aggregate(point.level@data$gi, 
                 by = list(numseg = point.level@data$numseg), FUN = mean)
colnames(res)[2] <- "gi"
segment.level <- merge(segment.level, res, by = "numseg")

Step 2) we create the average at areal level of pik: pk.pA (noted \(\bar{g}_{gk}\) in the article):

res <- aggregate(point.level@data[,c("pi1", "pi2", "pi3", "pi4", "pi5")],
                 by = list(numseg = point.level@data$numseg), FUN = mean)
colnames(res)[2:6] <- c("p1.pA", "p2.pA", "p3.pA", "p4.pA", "p5.pA")
segment.level <- merge(segment.level, res, by = "numseg")

Step 3) we create the average at areal level of dik: pk.dA (noted \(\bar{d}_{gk}\) in the article):

res <- aggregate(point.level@data[,c("di1", "di2", "di3", "di4", "di5")],
                 by = list(numseg = point.level@data$numseg), FUN = mean)
colnames(res)[2:6] <- c("p1.dA", "p2.dA", "p3.dA", "p4.dA", "p5.dA")
segment.level <- merge(segment.level, res, by = "numseg")

Step 4) we create the average point level absolute response error:

res <- aggregate(point.level@data[, c("erreur1.abs", "erreur2.abs", 
                                      "erreur3.abs", "erreur4.abs", 
                                      "erreur5.abs")],
                 by = list(numseg = point.level@data$numseg), FUN = mean)
segment.level <- merge(segment.level, res, by = "numseg")

Step 5) we create the areal level absolute response error:

segment.level@data$erreur1.abs.areal <- abs(segment.level@data$p1.dA - 
                                              segment.level@data$p1.pA)
segment.level@data$erreur2.abs.areal <- abs(segment.level@data$p2.dA - 
                                              segment.level@data$p2.pA)
segment.level@data$erreur3.abs.areal <- abs(segment.level@data$p3.dA - 
                                              segment.level@data$p3.pA)
segment.level@data$erreur4.abs.areal <- abs(segment.level@data$p4.dA - 
                                              segment.level@data$p4.pA)
segment.level@data$erreur5.abs.areal <- abs(segment.level@data$p5.dA - 
                                              segment.level@data$p5.pA)
rm(res)

4 Reproducing the figures, tables and comments

4.0.1 Figure 1

op <- par(mar = c(0.5, 0.5, 0.5, 0.5))
plot(cities, axes = T, xlim = c(370000, 380000), ylim = c(4825000, 4835000),
     border = "grey", xaxt = "n", yaxt = "n")
plot(segment.level, lwd = 2, add = T)
pal <- c("red", "gray")
plot(point.level, col = pal[ifelse(point.level@data$teruti, 1, 2)], cex = 0.3,
     add = T)
par(op)
rm(pal)

Figure 1: Areal level grid (in black) and points (Teruti-Lucas locations in red) in Toulouse area.

4.0.2 Figure 2

The figure 2 has been obtained on confidential data. Hence, codes are not given in here.

Figure 2: Classification tree chosen for the DGP.

4.0.3 Figure 3

op <- par(mfrow = c(2, 3), mar = c(2, 4, 1.5, 0.5))
plot(table(round(point.level@data$pi1, 3)), las = 1, ylab = "Frequency", 
     lwd = 5, lend = "square", xlim = c(0, 0.85), cex.axis = 0.8)
title(expression(paste("Urban: ",p[i1])), cex.main = 1.5)
plot(table(round(point.level@data$pi2, 3)), las = 1, ylab = "Frequency", 
     lwd = 5, lend = "square", xlim = c(0, 0.85), cex.axis = 0.8)
title(expression(paste("Farming: ", p[i2])), cex.main = 1.5)
plot(table(round(point.level@data$pi3, 3)), las = 1, ylab = "Frequency", 
     lwd = 5, lend = "square", xlim = c(0, 0.85), cex.axis = 0.8)
title(expression(paste("Forests: ", p[i3])), cex.main = 1.5)
plot(table(round(point.level@data$pi4, 3)), las = 1, ylab = "Frequency", 
     lwd = 5, lend = "square", xlim = c(0, 0.85), cex.axis = 0.8)
title(expression(paste("Pastures: ", p[i4])), cex.main = 1.5)
plot(table(round(point.level@data$pi5, 3)), las = 1, ylab = "Frequency", 
     lwd = 5, lend = "square", xlim = c(0, 0.85), cex.axis = 0.8)
title(expression(paste("Natural land: ", p[i5])), cex.main = 1.5)
par(op)

Figure 3: Bar charts of \(p_{ik}\).

4.0.4 Comments in the text (in section 3.2. The DGP probabilities)

knots(ecdf(point.level@data$pi1))
ecdf(point.level@data$pi1)(0.072)

Comment: ``Most of them are below 0.10 with \(95.4\%\) of values less than \(0.072\) for \(p_{i1}\)’’.

knots(ecdf(point.level@data$pi5))
ecdf(point.level@data$pi5)(0.084)

Comment: ``\(90.0\%\) of the values less than 0.084 for \(p_{i5}\)’’.

round(table(round(point.level@data$pi2, 3) == 0.726)/
        length(point.level@data$pi2)*100, 1)

Comment: ``\(23.7\%\) of the values equal to 0.726 for farming’’.

round(table(round(point.level@data$pi3, 3) == 0.822)/
        length(point.level@data$pi3)*100, 1)

Comment: ``\(26.0\%\) of the values equal to 0.822 for forests’’.

round(table(round(point.level@data$pi4, 3) == 0.570)/
        length(point.level@data$pi4)*100, 1)

Comment: \(21.9\%\) of the values equal to 0.570 for pastures’’.

4.0.5 Figure 4

op <- par(mfrow = c(2, 3), mar = c(4, 4, 1.8, 0.5))
plot(ecdf(point.level@data$pi1), main = "", xlab = expression(p[i1]),
     ylab = "ECDF", cex = 0.8, xlim = c(0, 0.85))
title("Urban", cex.main = 1.5)
plot(ecdf(point.level@data$pi2), main = "", xlab = expression(p[i2]),
     ylab = "ECDF", cex = 0.8, xlim = c(0, 0.85))
title("Farming", cex.main = 1.5)
plot(ecdf(point.level@data$pi3), main = "", xlab = expression(p[i3]),
     ylab = "ECDF", cex = 0.8, xlim = c(0, 0.85))
title("Forests", cex.main = 1.5)
plot(ecdf(point.level@data$pi4), main = "", xlab = expression(p[i4]),
     ylab = "ECDF", cex = 0.8, xlim = c(0, 0.85))
title("Pastures", cex.main = 1.5)
plot(ecdf(point.level@data$pi5), main = "", xlab = expression(p[i5]),
     ylab = "ECDF", cex = 0.8, xlim = c(0, 0.85))
title("Natural land", cex.main = 1.5)
par(op)

Figure 4: Empirical cumulative functions of \(p_{ik}\).

4.0.6 Figure 5

gris <- brewer.pal(5, "Greys")
bk <- c(0, 0.20, 0.40, 0.60, 0.80, 1)
op <- par(mfrow = c(2, 3), mar = c(0, 0, 2, 0), oma = c(0, 0, 0, 0))
plot(segment.level, border = "gray", 
  col = gris[findInterval(segment.level@data$p1.pA, bk, all.inside = T)])
plot(dept, add = T)
title("Urban", cex.main = 2)
plot(segment.level, border = "gray", 
 col = gris[findInterval(segment.level@data$p2.pA, bk, all.inside = T)])
plot(dept, add = T)
title("Farming", cex.main = 2)
plot(segment.level, border = "gray", 
 col = gris[findInterval(segment.level@data$p3.pA, bk, all.inside = T)])
plot(dept, add = T)
title("Forests", cex.main = 2)
plot(segment.level, border = "gray", 
 col = gris[findInterval(segment.level@data$p4.pA, bk, all.inside = T)])
plot(dept, add = T)
title("Pastures", cex.main = 2)
plot(segment.level, border = "gray", 
 col = gris[findInterval(segment.level@data$p5.pA, bk, all.inside = T)])
plot(dept, add = T)
title("Natural land", cex.main = 2)
plot(rep(1, 5), rev(seq(2, 4, 0.5)), axes = FALSE, xlim = c(0, 3),
 ylim = c(1, 5), pch = 15, cex = 3, col = gris, xlab = "", ylab = "")
text(1.8, 4.5, expression(bar(p)[gk]), cex = 2)
text(rep(1.2, 5), rev(seq(2, 4, 0.5)), 
     c("[0-0.2[", "[0.2-0.4[", "[0.4-0.6[", "[0.6-0.8[", "[0.8-1]"),
 pos = 4, offset = 0.2, cex = 1.5)
par(op)
rm(gris, bk)

Figure 5: DGP probabilities \(\bar{p}_{gk}\).

4.0.7 Figure 6

op <- par(mar = c(4, 4, 0.5, 0.5))
plot(table(round(point.level@data$gi,3)), las = 1, lwd = 5, lend = "square",
 cex.axis = 0.7, ylab = "Frequency",
 xlab = expression(paste("Point level Gini-Simpson impurity index (", gs[i],")")))
par(op)

Figure 6: Bar chart of point level GS impurity index (\(gs_i\)).

4.0.8 Table 2

Step 1) Recoding of gi in 8 groups according to the bar chart and the contingency table: gi.cl variable

point.level@data$gi.cl <- cut(point.level@data$gi,
                              breaks = c(0.3, 0.32, 0.38, 0.46, 0.463, 
                                         0.51, 0.61, 0.62, 0.722))
levels(point.level@data$gi.cl) <- c("0.314", "0.370", "0.450", "0.462", 
                                    "0.509", "0.604", "0.619", "0.682-0.721")
table(point.level@data$gi.cl)

Step 2) Contingency table landuse (Ui vs Gini) in 9 groups (gi.cl)

addmargins(table(point.level@data$Ui, point.level@data$gi.cl))

Obtaining row percents:

round(addmargins(prop.table(table(point.level@data$Ui, point.level@data$gi.cl), 
                            margin = 1))*100, 1)

Obtaining column percents:

round(addmargins(prop.table(table(point.level@data$Ui, point.level@data$gi.cl),
                            margin = 2))*100, 1)

The column percent table is used to complete Table 2: Characteristics of groups according to the \(gs_i\) value (2 first columns).

4.0.9 Figure 7

op <- par(mar = c(4.5, 4.5, 0.5, 0.5))
plot(factor(point.level@data$Ui), point.level@data$gi, xlab = "land use",
 ylim = c(0.3, 0.75), axes = F,
 ylab = expression(paste("Point level Gini-Simpson impurity index   ", 
                         gs[i])))
axis(1, at = seq(1, 5), labels = c("urban", "farming", "forests", 
                                   "pastures", "natural land"))
axis(2)
box()
par(op)

Figure 7 : Boxplots of \(gs_i\) by land use.

4.0.10 Figure 8

blue.pal <- brewer.pal(5, "Blues")
bk <- round(classIntervals(segment.level@data$gi, n = 5, style = "kmeans")$brks,
            digits = 2)
op <- par(mar = c(0, 0, 0, 0))
plot(segment.level, border = "gray",
     col = blue.pal[findInterval(segment.level@data$gi, 
                                 bk, all.inside = T)])
legend("topleft", legend = leglabs(bk), fill = blue.pal, inset = 0.05, 
       title = expression(bar(gs)[g]))
par(op)

Figure 8: Map of \(\bar{gs}_g\), the mean of \(gs_i\) at areal level.

4.0.11 Comments given in Section 4.2.

Spatial analysis of the areal level Gini-Simpson index gi (\(\bar{gs}_g\) in the article).

Step 1) Standardization of gi

x <- as.vector(scale(segment.level$gi))

Step 2) Creation of the neighborhood matrix (8 nearest neighbors), normalized

colw <- nb2listw(knn2nb(knearneigh(coordinates(segment.level), k = 8)), 
                 style = "W")

Step 3) Compute the Moran test

# 1. "free sampling model" version
# highly significant (p-value < 2.2e-16): presence of spatial autocorrelation
# normalized Moran I = 0.59
moran.test(x, colw, randomisation = FALSE)
## 
##  Moran I test under normality
## 
## data:  x  
## weights: colw    
## 
## Moran I statistic standard deviate = 60.053, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      5.851341e-01     -3.878976e-04      9.506434e-05
# 2. permutation test: "randomization model" version
# significant at 1%: presence of spatial autocorrelation, Moran I = 0.59
set.seed(1234)
moran.mc(x, colw, nsim = 99)
## 
##  Monte-Carlo simulation of Moran I
## 
## data:  x 
## weights: colw  
## number of simulations + 1: 100 
## 
## statistic = 0.58513, observed rank = 100, p-value = 0.01
## alternative hypothesis: greater

4.0.12 Figure 9

op <- par(mar = c(4, 4.5, 0.5, 0.5))
res.moran <- moran.plot(x, colw, xlab = expression(bar(gs)[g]),
                        ylab = expression(paste("W x ", bar(gs)[g])),
                        labels = F, pch = 20, cex = 0.6, cex.axis = 0.8, 
                        cex.lab = 0.9)
par(op)

Figure 9: Moran scatterplot of the Gini-Simpson index \(\bar{gs}_g\) (after standardization).

4.0.13 Figure 10

col.quad <- c("brown3", "bisque", "darkcyan", "darkseagreen1")
wx <- lag.listw(colw, x)

x1 <- which((x > mean(x)) & (wx > mean(wx)))    # HH
x2 <- which((x > mean(x)) & (wx < mean(wx)))    # HL
x3 <- which((x < mean(x)) & (wx < mean(wx)))    # LL
x4 <- which((x < mean(x)) & (wx > mean(wx)))    # LH

i.col <- rep("white", length(x))
i.col[x1] <- col.quad[1]
i.col[x2] <- col.quad[2]
i.col[x3] <- col.quad[3]
i.col[x4] <- col.quad[4]

influ <- rep("", length(x))
influ[apply(res.moran$is.inf, 1, any)] <- 
  row.names(segment.level)[apply(res.moran$is.inf, 1, any)]

op <- par(mar = c(0, 0, 0, 0))
plot(segment.level, col = i.col, cex = c(0.1, 1)[as.integer(influ == "") + 1],
     border = "gray")
plot(dept, lwd = 2, add = T)
legend("topleft", c("H-H", "H-L", "L-L", "L-H"), fill = col.quad, 
       title = "Cluster", inset = 0.05)
par(op)

Figure 10: Clusters of the Gini-Simpson index \(\bar{gs}_g\).

4.0.14 Figure 11

resI <- localmoran(x, colw)
resI1 <- which((resI[, 5] < 0.05) & (x > mean(x)) & (wx > mean(wx)))    # HH
resI2 <- which((resI[, 5] < 0.05) & (x > mean(x)) & (wx < mean(wx)))    # HL
resI3 <- which((resI[, 5] < 0.05) & (x < mean(x)) & (wx < mean(wx)))    # LL
resI4 <- which((resI[, 5] < 0.05) & (x < mean(x)) & (wx > mean(wx)))    # LH

p.col <- rep("white", length(x))
p.col[resI1] <- col.quad[1]
p.col[resI2] <- col.quad[2]
p.col[resI3] <- col.quad[3]
p.col[resI4] <- col.quad[4]

op <- par(mar = c(0, 0, 0, 0))
plot(segment.level, col = p.col, border = "gray")
plot(dept, lwd = 2, add = T)
legend("topleft", c("H-H", "L-L"), fill = c("brown3", "darkcyan"), 
       title = "Cluster", inset = 0.05)
par(op)

rm(x, colw, wx, res.moran, col.quad, x1, x2, x3, x4, i.col, influ, resI, resI1,
   resI2, resI3, resI4, p.col)

Figure 11: Segments with significant LISA for the \(\bar{gs}_g\).

4.0.15 Table 3

Step 1) Absolute error when \(d_{ik}=0\) (1st part of the table)

round(summary(point.level@data$erreur1.abs[point.level@data$di1 == 0]), 2)
round(summary(point.level@data$erreur2.abs[point.level@data$di2 == 0]), 2)
round(summary(point.level@data$erreur3.abs[point.level@data$di3 == 0]), 2)
round(summary(point.level@data$erreur4.abs[point.level@data$di4 == 0]), 2)
round(summary(point.level@data$erreur5.abs[point.level@data$di5 == 0]), 2)

length(point.level@data$erreur1.abs[point.level@data$di1 == 0])
length(point.level@data$erreur2.abs[point.level@data$di2 == 0])
length(point.level@data$erreur3.abs[point.level@data$di3 == 0])
length(point.level@data$erreur4.abs[point.level@data$di4 == 0])
length(point.level@data$erreur5.abs[point.level@data$di5 == 0])

Step 2) Absolute error when \(d_{ik}=1\) (2nd part of the table)

round(summary(point.level@data$erreur1.abs[point.level@data$di1 == 1]), 2)
round(summary(point.level@data$erreur2.abs[point.level@data$di2 == 1]), 2)
round(summary(point.level@data$erreur3.abs[point.level@data$di3 == 1]), 2)
round(summary(point.level@data$erreur4.abs[point.level@data$di4 == 1]), 2)
round(summary(point.level@data$erreur5.abs[point.level@data$di5 == 1]), 2)

length(point.level@data$erreur1.abs[point.level@data$di1==1])
length(point.level@data$erreur2.abs[point.level@data$di2==1])
length(point.level@data$erreur3.abs[point.level@data$di3==1])
length(point.level@data$erreur4.abs[point.level@data$di4==1])
length(point.level@data$erreur5.abs[point.level@data$di5==1])

Step 3) Relative error when \(d_{ik}=1\) (3rd part of the table)

round(summary(point.level@data$erreur1.rel[point.level@data$di1 == 1]), 2)
round(summary(point.level@data$erreur2.rel[point.level@data$di2 == 1]), 2)
round(summary(point.level@data$erreur3.rel[point.level@data$di3 == 1]), 2)
round(summary(point.level@data$erreur4.rel[point.level@data$di4 == 1]), 2)
round(summary(point.level@data$erreur5.rel[point.level@data$di5 == 1]), 2)

length(point.level@data$erreur1.rel[point.level@data$di1 == 1])
length(point.level@data$erreur2.rel[point.level@data$di2 == 1])
length(point.level@data$erreur3.rel[point.level@data$di3 == 1])
length(point.level@data$erreur4.rel[point.level@data$di4 == 1])
length(point.level@data$erreur5.rel[point.level@data$di5 == 1])

Table 3: Descriptives statistics of absolute and relative error at point level.

4.0.16 Figure NOT included in this article but in Chakir et al (2016c)

``Another plot for exploring the absolute response error is proposed by Haaf et al. (2014) and can be found in Chakir et al. (2016b) for our data set of interest. It is called the Cumulative Distribution Function of Error Tolerance (CDFET) and consists in the empirical cumulative distribution function of the response error for each land use.’’

CDFET1 <- ecdf(point.level@data$erreur1.abs)
CDFET2 <- ecdf(point.level@data$erreur2.abs)
CDFET3 <- ecdf(point.level@data$erreur3.abs)
CDFET4 <- ecdf(point.level@data$erreur4.abs)
CDFET5 <- ecdf(point.level@data$erreur5.abs)

op <- par(mar = c(4, 4, 0.5, 0.5))
plot(CDFET1, verticals = T, col = "red", main = "", xlim = c(0, 1),
     ylab = "CDFET", cex = 0.8, 
     xlab = expression(paste("Pointwise absolute response error: ",
                             group("|",d[ik] - p[ik], "|"))))
plot(CDFET2, verticals = T, col = "yellow", main = "", cex = 0.8, add = T)
plot(CDFET3, verticals = T, col = "darkgreen", main = "", cex = 0.8, add = T)
plot(CDFET4, verticals = T, col = "palegreen", main = "", cex = 0.8, add = T)
plot(CDFET5, verticals = T, col = "grey", main = "", cex = 0.8, add = T)
legend("right", legend = c("urban", "farming", "forests", "pastures", 
                           "natural land"), lty = 1, 
       col = c("red", "yellow", "darkgreen", "palegreen", "grey"), 
       inset = 0.01)
par(op)

rm(CDFET1, CDFET2, CDFET3, CDFET4, CDFET5)

4.0.17 Figure 12

Figure 12 : Absolute response error vs Gini-Simpson index \(g_i\).

1st row of the figure:

op <- par(mfrow = c(1, 5), oma = c(0, 0.3, 1.5, 0), mar = c(1.1, 2.3, 3.5, 0),
          mai = c(0.05, 0.13, 0.3, 0.0), omi = c(0, 0.5, 0.05, 0))
bp <- with(subset(point.level@data, di1 == 1), 
           boxplot(erreur1.abs ~ gi.cl, xaxt = "n", las = 1, 
                   ylim = c(0, 1.1), cex.axis = 1.5))
mtext("urban observed", side = 2, line = 3, cex = 1.3, font = 2)
text(2.5, 1.1, expression(paste(group("|", d[i1] - p[i1], "|"), " when ",
                                d[i1], "=1")), cex = 1.5)
axis(1, at = 1:8, labels = FALSE)
title("error on urban", cex.main = 1.9)
bp <- with(subset(point.level@data, di1 == 1), 
           boxplot(erreur2.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i2] - p[i2], "|"), " when ",
                                d[i1],"=1")), cex = 1.5)
title("error on farming", cex.main = 1.9)
bp <- with(subset(point.level@data, di1 == 1), 
           boxplot(erreur3.abs ~ gi.cl, xaxt = "n", yaxt = "n",
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i3] - p[i3], "|"), " when ",
                                d[i1], "=1")), cex = 1.5)
title("error on forests", cex.main = 1.9)
bp <- with(subset(point.level@data, di1 == 1), 
           boxplot(erreur4.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i4] - p[i4], "|"), " when ",
                                d[i1], "=1")), cex = 1.5)
title("error on pastures", cex.main = 1.9)
bp <- with(subset(point.level@data, di1 == 1), boxplot(erreur5.abs ~ gi.cl,
 xaxt = "n", yaxt = "n", las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i5] - p[i5],"|"), " when ",
                                d[i1], "=1")), cex = 1.5)
title("error on natural land", cex.main = 1.9)
par(op)

2nd row of the figure:

op <- par(mfrow = c(1, 5), oma = c(0, 0, 0, 0), mar = c(1.1, 2.0, 1.8, 0),
          mai = c(0.05, 0.1, 0.05, 0.0), omi = c(0, 0.5, 0, 0))
bp <- with(subset(point.level@data, di2 == 1), 
           boxplot(erreur1.abs ~ gi.cl, xaxt = "n", las = 1, 
                   ylim = c(0, 1.1), cex.axis = 1.5))
mtext("farming observed", side = 2, line = 3, cex = 1.3, font = 2)
text(2.5, 1.1, expression(paste(group("|", d[i1] - p[i1], "|"), " when ",
                                d[i2], "=1")), cex = 1.5)
axis(1, at = 1:8, labels = FALSE)
bp <- with(subset(point.level@data, di2 == 1), 
           boxplot(erreur2.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1), cex.axis = 1.5))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i2] - p[i2],"|"), " when ",
                                d[i2], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di2 == 1), 
           boxplot(erreur3.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i3] - p[i3], "|"), " when ",
                                d[i2], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di2 == 1), 
           boxplot(erreur4.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i4] - p[i4], "|"), " when ",
                                d[i2], "=1")), cex = 1.5)
bp <- with(subset(point.level@data,di2==1), 
           boxplot(erreur5.abs ~ gi.cl, xaxt = "n", 
                   yaxt = "n", las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i5] - p[i5], "|"), " when ",
                                d[i2], "=1")), cex = 1.5)
par(op)

3rd row of the figure:

op <- par(mfrow = c(1, 5), oma = c(0, 0, 0, 0), mar = c(1.1, 2.0, 1.8, 0),
          mai = c(0.05, 0.1, 0.05, 0.0), omi = c(0, 0.5, 0, 0))
bp <- with(subset(point.level@data, di3 == 1), 
           boxplot(erreur1.abs ~ gi.cl, xaxt = "n", las = 1,
                   ylim = c(0, 1.1), cex.axis = 1.5))
mtext("forests observed", side = 2, line = 3, cex = 1.3, font = 2)
text(2.5, 1.1, expression(paste(group("|", d[i1] - p[i1], "|"), " when ",
                                d[i3], "=1")), cex = 1.5)
axis(1, at = 1:8, labels = FALSE)
bp <- with(subset(point.level@data, di3 == 1), 
           boxplot(erreur2.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1), cex.axis = 1.5))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i2] - p[i2],"|"), " when ",
                                d[i3], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di3 == 1), boxplot(erreur3.abs~gi.cl,
 xaxt = "n", yaxt = "n", las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i3] - p[i3],"|"), " when ",
                                d[i1], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di3 == 1), 
           boxplot(erreur4.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i4] - p[i4],"|"), " when ",
                                d[i3], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di3 == 1), 
           boxplot(erreur5.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i5] - p[i5], "|"), " when ",
                                d[i3], "=1")), cex = 1.5)
par(op)

4th row of the figure:

op <- par(mfrow = c(1,5), oma = c(0, 0, 0, 0), mar = c(1.1, 2.0, 1.8, 0),
          mai = c(0.05, 0.1, 0.05, 0.0), omi = c(0, 0.5, 0, 0))
bp <- with(subset(point.level@data, di4 == 1), 
           boxplot(erreur1.abs ~ gi.cl, xaxt = "n", las = 1, 
                   ylim = c(0, 1.1), cex.axis = 1.5))
mtext("pastures observed", side = 2, line = 3, cex = 1.3, font = 2)
text(2.5, 1.1, expression(paste(group("|", d[i1] - p[i1], "|"), " when ",
                                d[i4], "=1")), cex = 1.5)
axis(1, at = 1:8, labels = FALSE)
bp <- with(subset(point.level@data, di4 == 1), 
           boxplot(erreur2.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i2] - p[i2],"|"), " when ",
                                d[i4], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di4 == 1), 
           boxplot(erreur3.abs ~ gi.cl, xaxt = "n", 
                   yaxt = "n", las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i3] - p[i3], "|"), " when ",
                                d[i4], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di4 == 1), 
           boxplot(erreur4.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i4] - p[i4],"|"), " when ",
                                d[i4], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di4 == 1), 
           boxplot(erreur5.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:8, labels = FALSE)
text(2.5, 1.1, expression(paste(group("|", d[i5] - p[i5],"|"), " when ",
                                d[i4], "=1")), cex = 1.5)
par(op)

5th row of the figure:

op <- par(mfrow = c(1, 5), oma = c(0, 0, 0, 0), mar = c(1.8, 2.0, 1.8, 0),
          mai = c(0.7, 0.1, 0.05, 0.0), omi = c(0.15, 0.5, 0, 0))
bp <- with(subset(point.level@data, di5 == 1), 
           boxplot(erreur1.abs ~ gi.cl, xaxt = "n", las = 1, 
                   ylim = c(0, 1.1), cex.axis = 1.5))
mtext("natural land observed", side = 2, line = 3, cex = 1.3, font = 2)
text(1:9, -0.09, labels = levels(point.level@data$gi.cl), srt = 30, adj = 1,
     xpd = TRUE, cex = 1.7)
text(2.5, 1.1, expression(paste(group("|", d[i1] - p[i1], "|"), " when ",
                                d[i5], "=1")), cex = 1.5)
axis(1, at = 1:9, labels = FALSE)
bp <- with(subset(point.level@data, di5 == 1), 
           boxplot(erreur2.abs ~ gi.cl, xaxt = "n", 
                   yaxt = "n", las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:9, labels = FALSE)
text(1:9, -0.09, labels = levels(point.level@data$gi.cl), srt = 30, adj= 1,
     xpd = TRUE, cex=1.7)
text(2.5, 1.1, expression(paste(group("|", d[i2] - p[i2],"|"), " when ",
                                d[i5], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di5 == 1), 
           boxplot(erreur3.abs ~ gi.cl, xaxt = "n", yaxt = "n",
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:9, labels = FALSE)
text(1:9, -0.09, labels = levels(point.level@data$gi.cl),
     srt = 30, adj= 1, xpd = TRUE, cex=1.7)
text(2.5, 1.1, expression(paste(group("|", d[i3] - p[i3], "|"), " when ",
                                d[i5], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di5 == 1), 
           boxplot(erreur4.abs ~ gi.cl, xaxt = "n", yaxt = "n",
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:9, labels = FALSE)
text(1:9, -0.09, labels = levels(point.level@data$gi.cl), srt = 30, adj= 1,
     xpd = TRUE, cex=1.7)
text(2.5, 1.1, expression(paste(group("|", d[i4] - p[i4], "|"), " when ",
                                d[i5], "=1")), cex = 1.5)
bp <- with(subset(point.level@data, di5 == 1), 
           boxplot(erreur5.abs ~ gi.cl, xaxt = "n", yaxt = "n", 
                   las = 1, ylim = c(0, 1.1)))
axis(2, labels = FALSE)
axis(1, at = 1:9, labels = FALSE)
text(1:9, -0.09, labels = levels(point.level@data$gi.cl), srt = 30, adj= 1,
     xpd = TRUE, cex=1.7)
text(2.5, 1.1, expression(paste(group("|", d[i5] - p[i5],"|"), " when ",
                                d[i5], "=1")), cex = 1.5)
par(op)
rm(bp)

4.0.18 Table 4

Table 4: Descriptive statistics of the two types of absolute response error at areal level

Step 1) Average point level absolute response error:

round(summary(segment.level@data$erreur1.abs), 2)
round(summary(segment.level@data$erreur2.abs), 2)
round(summary(segment.level@data$erreur3.abs), 2)
round(summary(segment.level@data$erreur4.abs), 2)
round(summary(segment.level@data$erreur5.abs), 2)

Step 2) Aggregated absolute response error:

round(summary(segment.level@data$erreur1.abs.areal), 2)
round(summary(segment.level@data$erreur2.abs.areal), 2)
round(summary(segment.level@data$erreur3.abs.areal), 2)
round(summary(segment.level@data$erreur4.abs.areal), 2)
round(summary(segment.level@data$erreur5.abs.areal), 2)

4.0.19 Figure 13

bk <- c(0, 0.1, 0.2, 0.3, 0.4, 0.5)
op <- par(mfrow = c(2, 3), mar = c(0, 0, 3, 0))
plot(segment.level, border = "gray",
     col = blue.pal[findInterval(segment.level@data$erreur1.abs, 
                                 bk, all.inside = T)])
title(expression(paste("Urban: ", sum(group("|", d[i1]-p[i1],"|"), i %in% Gg),
                       " /#", Gg)), cex.main = 2)
plot(segment.level, border = "gray", 
     col = blue.pal[findInterval(segment.level@data$erreur2.abs,
                                 bk, all.inside = T)])
title(expression(paste("Farming: ", sum(group("|", d[i2]-p[i2],"|"), i %in% Gg),
                       " /#", Gg)), cex.main = 2)
plot(segment.level, border = "gray", 
     col = blue.pal[findInterval(segment.level@data$erreur3.abs,
                                 bk, all.inside = T)])
title(expression(paste("Forests: ", sum(group("|",d[i3]-p[i3],"|"), i %in% Gg),
                       " /#", Gg)), cex.main = 2)
plot(segment.level, border = "gray", 
     col = blue.pal[findInterval(segment.level@data$erreur4.abs,
                                 bk, all.inside = T)])
title(expression(paste("Pastures: ", sum(group("|",d[i4]-p[i4],"|"), i %in% Gg),
                       " /#", Gg)), cex.main = 2)
plot(segment.level, border = "gray", 
     col = blue.pal[findInterval(segment.level@data$erreur5.abs,
                                 bk, all.inside = T)])
title(expression(paste("Natural land: ", sum(group("|", d[i5]-p[i5],"|"),
                                             i %in% Gg), " /#", Gg)), 
      cex.main = 2)
plot(rep(1, 5), rev(seq(2, 4, 0.5)), axes = FALSE, xlim = c(0, 3), ylim = c(1,5),
     pch = 15, cex = 3, col = blue.pal, xlab = "", ylab = "")
text(1.8, 4.5, "Average point level absolute response error", cex = 1.5)
text(rep(1.2, 5), rev(seq(2, 4, 0.5)), c("[0-0.1[", "[0.1-0.2[", "[0.2-0.3[",
                                         "[0.3-0.4[", "[0.4-0.5]"), 
     pos = 4, offset = 0.2, cex = 1.5)
par(op)
rm(bk)

Figure 13: Average point level absolute response error.

4.0.20 Figure 14

b1 <- round(classIntervals(segment.level@data$erreur1.abs.areal, n = 5,
                           style = "kmeans")$brks, digits = 2)
b2 <- round(classIntervals(segment.level@data$erreur2.abs.areal, n = 5,
                           style = "kmeans")$brks, digits = 2)
b3 <- round(classIntervals(segment.level@data$erreur3.abs.areal, n = 5,
                           style = "kmeans")$brks, digits = 2)
b4 <- round(classIntervals(segment.level@data$erreur4.abs.areal, n = 5,
                           style = "kmeans")$brks, digits = 2)
b5 <- round(classIntervals(segment.level@data$erreur5.abs.areal, n = 5,
                           style = "kmeans")$brks, digits = 2)
bk <- 1/5*(b1 + b2 + b3 + b4 + b5)
bk[6] <- max(b1[6], b2[6], b3[6], b4[6], b5[6])

op <- par(mfrow = c(2, 3), mar = c(0, 0, 3, 0))
plot(segment.level, border = "gray",
     col = blue.pal[findInterval(segment.level@data$erreur1.abs.areal,
                                 bk, all.inside = T)])
title(expression(paste("Urban: ", group("|", bar(d)[g1] - bar(p)[g1], "|"))),
      cex.main = 2)
plot(segment.level, border = "gray",
     col = blue.pal[findInterval(segment.level@data$erreur2.abs.areal, 
                                 bk, all.inside = T)])
title(expression(paste("Farming: ", group("|", bar(d)[g2] - bar(p)[g2], "|"))),
      cex.main = 2)
plot(segment.level, border = "gray", 
     col = blue.pal[findInterval(segment.level@data$erreur3.abs.areal,
                                 bk, all.inside = T)])
title(expression(paste("Forests: ", group("|", bar(d)[g3] - bar(p)[g3], "|"))),
      cex.main = 2)
plot(segment.level, border = "gray", 
     col = blue.pal[findInterval(segment.level@data$erreur4.abs.areal, 
                                 bk, all.inside = T)])
title(expression(paste("Pastures: ", group("|", bar(d)[g4] - bar(p)[g4], "|"))),
      cex.main = 2)
plot(segment.level, border = "gray", 
     col = blue.pal[findInterval(segment.level@data$erreur5.abs.areal,
                                 bk, all.inside = T)])
title(expression(paste("Natural land: ", group("|", bar(d)[g5]-bar(p)[g5], "|"))),
      cex.main = 2)
plot(rep(1, 5), rev(seq(2, 4, 0.5)), axes = FALSE, xlim = c(0, 3), ylim = c(1, 5),
     pch = 15, cex = 3, col = blue.pal, xlab = "", ylab = "")
text(1.8, 4.5, "Areal level absolute response error", cex = 1.5)
text(rep(1.2, 5), rev(seq(2, 4, 0.5)), c("[0-0.01[", "[0.01-0.18[",
                                         "[0.18-0.032[", "[0.032-0.05[", 
                                         "[0.05-0.15]"), 
     pos = 4, offset = 0.2, cex = 1.5)
par(op)
rm(b1, b2, b3, b4, b5, bk)

Figure 14: Areal level absolute response error.

4.0.21 Figure NOT included in this article but in Chakir et al (2016c)

``In Chakir et al. (2016b), the CDFET for the areal level absolute response error is compared with the CDFET for the average point level response error. The comparison confirms once more that the areal level response error is very small and for all land uses.’’

CDFET1_agg <- ecdf(segment.level@data$erreur1.abs)
CDFET2_agg <- ecdf(segment.level@data$erreur2.abs)
CDFET3_agg <- ecdf(segment.level@data$erreur3.abs)
CDFET4_agg <- ecdf(segment.level@data$erreur4.abs)
CDFET5_agg <- ecdf(segment.level@data$erreur5.abs)
CDFET1 <- ecdf(segment.level@data$erreur1.abs.areal)
CDFET2 <- ecdf(segment.level@data$erreur2.abs.areal)
CDFET3 <- ecdf(segment.level@data$erreur3.abs.areal)
CDFET4 <- ecdf(segment.level@data$erreur4.abs.areal)
CDFET5 <- ecdf(segment.level@data$erreur5.abs.areal)

op <- par(mfrow = c(1, 2), mar = c(5, 5, 0.5, 0.5))
plot(CDFET1_agg, verticals = T, col = "red", main = "", cex = 0.8,
     xlab = expression(paste("Average point level absolute response error: 1/#", 
                             Gg, sum(group("|",d[ik]-p[ik],"|"), i %in% Gg))), 
     xlim = c(0, 0.5), ylab = "CDFET")
plot(CDFET2_agg, verticals = T, col = "yellow", main = "", xlim = c(0, 0.5),
     cex = 0.8, add = T)
plot(CDFET3_agg, verticals = T, col = "darkgreen", main = "", xlim = c(0, 0.5),
     cex = 0.8, add = T)
plot(CDFET4_agg, verticals = T, col = "palegreen", main = "", xlim = c(0, 0.5),
     cex = 0.8, add = T)
plot(CDFET5_agg, verticals = T, col = "grey", main = "", xlim = c(0, 0.5),
     cex = 0.8, add = T)
plot(CDFET1, verticals = T, col = "red", main = "", cex = 0.8,
     xlab = expression(paste("Areal level absolute response error: ", 
                             group("|", paste(bar(d)[gk]-bar(p)[gk]), "|"))), 
     xlim = c(0, 0.5), ylab = "CDFET")
plot(CDFET2, verticals = T, col = "yellow", main = "", xlim = c(0, 0.5),
     cex = 0.8, add = T)
plot(CDFET3, verticals = T, col = "darkgreen", main = "", xlim = c(0, 0.5),
     cex = 0.8, add = T)
plot(CDFET4, verticals = T, col = "palegreen", main = "", xlim = c(0, 0.5),
     cex = 0.8, add = T)
plot(CDFET5, verticals = T, col = "grey", main = "", xlim = c(0, 0.5),
     cex = 0.8, add = T)
legend("right", legend = c("urban", "farming", "forests", "pastures", 
                           "natural land"), lty = 1, 
       col = c("red", "yellow", "darkgreen", "palegreen", "grey"))
par(op)

rm(CDFET1, CDFET2, CDFET3, CDFET4, CDFET5, CDFET1_agg, CDFET2_agg, CDFET3_agg,
 CDFET4_agg, CDFET5_agg)

4.0.22 Figure 15

op <- par(mfrow = c(2, 3), mar = c(5, 5, 3, 0.5))
plot(segment.level@data$gi, segment.level@data$erreur1.abs, xlim = c(0.3, 0.7),
     ylim = c(0, 0.5), xlab = expression(paste("Aggregated impurity index ", 
                                               bar(gs)[g])),
     ylab = expression(paste("1/#", Gg, sum(group("|", d[i1]-p[i1], "|"), 
                                            i %in% Gg))))
panel.smooth(segment.level@data$gi, segment.level@data$erreur1.abs,
             col.smooth = "blue") 
title(expression(paste("Urban: 1/#", Gg, sum(group("|", d[i1]-p[i1],"|"), 
                                             i %in% Gg))),
      cex.main = 1.5)
plot(segment.level@data$gi, segment.level@data$erreur2.abs, xlim = c(0.3, 0.7),
     ylim = c(0, 0.5), xlab = expression(paste("Aggregated impurity index ", 
                                               bar(gs)[g])),
     ylab = expression(paste("1/#", Gg, sum(group("|", d[i2]-p[i2], "|"), 
                                            i %in% Gg))))
panel.smooth(segment.level@data$gi, segment.level@data$erreur2.abs,
             col.smooth = "blue")
title(expression(paste("Farming: 1/#", Gg, sum(group("|", d[i2]-p[i2], "|"),
                                               i %in% Gg))), cex.main = 1.5)
plot(segment.level@data$gi, segment.level@data$erreur3.abs, xlim = c(0.3, 0.7),
     ylim = c(0, 0.5), xlab = expression(paste("Aggregated impurity index ", 
                                               bar(gs)[g])),
     ylab = expression(paste("1/#", Gg, sum(group("|", d[i3]-p[i3],"|"), 
                                            i %in% Gg))))
panel.smooth(segment.level@data$gi, segment.level@data$erreur3.abs,
             col.smooth = "blue")
title(expression(paste("Forests: 1/#", Gg, sum(group("|", d[i3]-p[i3], "|"),
                                               i %in% Gg))), cex.main = 1.5)
plot(segment.level@data$gi, segment.level@data$erreur4.abs, xlim = c(0.3, 0.7),
     ylim = c(0, 0.5), xlab = expression(paste("Aggregated impurity index ", 
                                               bar(gs)[g])),
     ylab = expression(paste("1/#", Gg, sum(group("|", d[i4]-p[i4],"|"), 
                                            i %in% Gg))))
panel.smooth(segment.level@data$gi, segment.level@data$erreur4.abs,
             col.smooth = "blue")
title(expression(paste("Pastures: 1/#", Gg, sum(group("|", d[i4]-p[i4], "|"),
 i %in% Gg))), cex.main = 1.5)
plot(segment.level@data$gi, segment.level@data$erreur5.abs, xlim = c(0.3, 0.7),
     ylim = c(0, 0.5), xlab = expression(paste("Aggregated impurity index ", 
                                               bar(gs)[g])),
     ylab = expression(paste("1/#", Gg, sum(group("|", d[i5]-p[i5], "|"), 
                                            i %in% Gg))))
panel.smooth(segment.level@data$gi, segment.level@data$erreur5.abs,
             col.smooth = "blue")
title(expression(paste("Natural land: 1/#", Gg, sum(group("|", d[i5]-p[i5], "|"),
                                                    i %in% Gg))), cex.main = 1.5)
par(op)

Figure 15: Average point level absolute response error vs aggregated Gini-Simpson impurity index \(\bar{gs}_g\).

4.0.23 Figure 16

op <- par(mfrow = c(2, 3), mar = c(5, 5, 3, 0.5))
plot(segment.level@data$gi, segment.level@data$erreur1.abs.areal, 
     xlim = c(0.3, 0.7), ylim = c(0, 0.15), 
     xlab = expression(paste("Aggregated impurity index ", bar(gs)[g])),
     ylab = expression(group("|", bar(d)[g1]-bar(p)[g1], "|")))
panel.smooth(segment.level@data$gi, segment.level@data$erreur1.abs.areal,
             col.smooth = "blue") 
title(expression(paste("Urban: ", group("|", bar(d)[g1]-bar(p)[g1], "|"))),
      cex.main = 1.5)
plot(segment.level@data$gi, segment.level@data$erreur2.abs.areal, 
     xlim = c(0.3, 0.7), ylim = c(0, 0.15), 
     xlab = expression(paste("Aggregated impurity index ", bar(gs)[g])),
     ylab = expression(group("|", bar(d)[g2]-bar(p)[g2], "|")))
panel.smooth(segment.level@data$gi, segment.level@data$erreur2.abs.areal,
             col.smooth = "blue")
title(expression(paste("Farming: ", group("|", bar(d)[g2]-bar(p)[g2], "|"))),
      cex.main = 1.5)
plot(segment.level@data$gi, segment.level@data$erreur3.abs.areal, 
     xlim = c(0.3, 0.7), ylim = c(0, 0.15), 
     xlab = expression(paste("Aggregated impurity index ", bar(gs)[g])),
     ylab = expression(group("|", bar(d)[g3]-bar(p)[g3], "|")))
panel.smooth(segment.level@data$gi, segment.level@data$erreur3.abs.areal,
             col.smooth = "blue")
title(expression(paste("Forests: ", group("|", bar(d)[g3]-bar(p)[g3], "|"))),
      cex.main = 1.5)
plot(segment.level@data$gi, segment.level@data$erreur4.abs.areal, 
     xlim = c(0.3, 0.7), ylim = c(0, 0.15), 
     xlab = expression(paste("Aggregated impurity index ", bar(gs)[g])),
     ylab = expression(group("|", bar(d)[g4]-bar(p)[g4], "|")))
panel.smooth(segment.level@data$gi, segment.level@data$erreur4.abs.areal,
             col.smooth = "blue")
title(expression(paste("Pastures: ", group("|", bar(d)[g4]-bar(p)[g4], "|"))),
      cex.main = 1.5)
plot(segment.level@data$gi, segment.level@data$erreur5.abs.areal, 
     xlim = c(0.3, 0.7), ylim = c(0, 0.15), 
     xlab = expression(paste("Aggregated impurity index ", bar(gs)[g])),
     ylab = expression(group("|", bar(d)[g5]-bar(p)[g5], "|")))
panel.smooth(segment.level@data$gi,segment.level@data$erreur5.abs.areal,
             col.smooth = "blue")
title(expression(paste("Natural land: ", 
                       group("|", bar(d)[g5]-bar(p)[g5], "|"))),
      cex.main = 1.5)
par(op)

Figure 16: Areal absolute response error vs aggregated Gini-Simpson impurity index \(\bar{gs}_g\).

4.0.24 Annex

Simulation by multinomial random draw with probas pi1, pi2, pi3, pi4, pi5 and creation of new variables: di1, di2, di3, di4, di5 (and then Ui).

Step 1) we identify the leafs of the tree :

leaf.tree <- unique(point.level@data[, c("pi1", "pi2", "pi3", "pi4", "pi5")])

Step 2) for each point, we give a number of leaf between 1 and the number of leaf :

res.tree <-  vector("integer", nrow(point.level))
for(k in 1:nrow(leaf.tree))
  res.tree[point.level@data[, "pi1"] == leaf.tree[k, "pi1"]] <- k

Step 3) simulation :

res.sim <- matrix(0, nrow(point.level), 5)
for(k in 1:nrow(leaf.tree))
  res.sim[res.tree == k, ] <- 
  t(rmultinom(length(which(point.level@data[, "pi1"] == leaf.tree$pi1[k])),
              size = 1, prob = leaf.tree[k, c("pi1", "pi2", "pi3", "pi4", "pi5")]))

Step 4) we obtain Ui:

Ui <- apply(res.sim, 1, which.max)